Java Tools - jstack

1. Usage

jstack是用来打印Stack trace的一个工具. 由JDK提供, 在$JAVA_HOME/bin/目录下. 对于一个Java进程来说, 可以使用jps, 或者ps命令获取进程ID. 假设进程ID为9876, 则下面的命令就可以dump出相关的线程信息:

1
jstack 9876

2. Explanation

举个栗子, 执行如下代码:

1
2
3
4
5
6
7
public class Test {

public static void main(String[] args) throws InterruptedException {
Thread.sleep(1000000);
}

}

查看端口 jps:

1
2
3
4
5
6
13120 Test
13122 Jps
10674 bootloader.jar
9042 ZooKeeperMain
7076
12727 Main

Dump线程信息jstack 13120:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.121-b13 mixed mode):

"Attach Listener" #9 daemon prio=9 os_prio=31 tid=0x00007fe05d850000 nid=0x1007 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"Service Thread" #8 daemon prio=9 os_prio=31 tid=0x00007fe05d042800 nid=0x4703 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"C1 CompilerThread2" #7 daemon prio=9 os_prio=31 tid=0x00007fe05d039800 nid=0x4503 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" #6 daemon prio=9 os_prio=31 tid=0x00007fe05d038800 nid=0x4303 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" #5 daemon prio=9 os_prio=31 tid=0x00007fe05d020000 nid=0x4103 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" #4 daemon prio=9 os_prio=31 tid=0x00007fe05d01f000 nid=0x3f0b runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"Finalizer" #3 daemon prio=8 os_prio=31 tid=0x00007fe05d012800 nid=0x3203 in Object.wait() [0x000070000ca60000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x0000000795588ec8> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
- locked <0x0000000795588ec8> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)

"Reference Handler" #2 daemon prio=10 os_prio=31 tid=0x00007fe05c822800 nid=0x3003 in Object.wait() [0x000070000c95d000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x0000000795586b68> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
- locked <0x0000000795586b68> (a java.lang.ref.Reference$Lock)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)

"main" #1 prio=5 os_prio=31 tid=0x00007fe05d003800 nid=0x1a03 waiting on condition [0x000070000c34b000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at test.Test.main(Test.java:6)

"VM Thread" os_prio=31 tid=0x00007fe05d809000 nid=0x2e03 runnable

"GC task thread#0 (ParallelGC)" os_prio=31 tid=0x00007fe05c800800 nid=0x2607 runnable

"GC task thread#1 (ParallelGC)" os_prio=31 tid=0x00007fe05c808800 nid=0x2803 runnable

"GC task thread#2 (ParallelGC)" os_prio=31 tid=0x00007fe05c809000 nid=0x2a03 runnable

"GC task thread#3 (ParallelGC)" os_prio=31 tid=0x00007fe05c80a000 nid=0x2c03 runnable

"VM Periodic Task Thread" os_prio=31 tid=0x00007fe05d015000 nid=0x4903 waiting on condition

JNI global references: 6

注意, JVM仅仅启动一个用户线程. 即main线程. 除此之外都是JVM启动的线程, 可以大致看一下方便排错时排除. 以main线程为例:

1
2
3
4
"main" #1 prio=5 os_prio=31 tid=0x00007fe05d003800 nid=0x1a03 waiting on condition [0x000070000c34b000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at test.Test.main(Test.java:6)

可以看到都是调用的stack信息, 主要信息都在前两行.

2.1 Java Thread Name & ID

"main" #1分别表示Java线程的名称和ID. 在Java代码中可以通过Thread#getName, Thead#getId获取.

2.2 线程优先级

prio=5表示线程的优先级, 即Thread#getPriority

2.2 NID & TID

tid=0x00007fe05d003800 nid=0x1a03

  • NID即Native thread ID, 与操作系统相关. 在Linux平台中指的是”light-weight process”. Linux下可以通过如下命令查看一个进程的线程ID:
    ps -T -p $PID // SPID列
    top -H -p $PID // PID列
  • TID为C++层次的线程/Java线程地址.

2.3 Thread State

java.lang.Thread.State: TIMED_WAITING: 当前线程状态, 是一个枚举类. 可以直接看注释. 只有以下几种:

- NEW
- RUNNABLE
- BLOCKED
- WAITING
- TIMED_WAITING
- TERMINATED

除了基本的状态之外还有一些更详细的描述信息sleeping,waiting on condition [0x000070000c34b000]. 如果发生死锁等事件也将在此体现.

3. Best Practice

3.1 寻找哪里阻塞

背景

在部署一个Web应用后, 向其发送请求, 但是一直没有响应. 最后查看Nginx日志查看上游超时了, 但是应用没有抛出异常.

排查

Dump的线程信息有一条内容如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
http--2801-4$1476360005" #25 daemon prio=5 os_prio=0 tid=0x00007fa884011000 nid=0x6b67 runnable [0x00007fa87f532000]
java.lang.Thread.State: RUNNABLE
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
- locked <0x00000000910e2f20> (a java.net.SocksSocketImpl)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at java.net.Socket.connect(Socket.java:538)
at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
- locked <0x00000000910e2e98> (a sun.net.www.http.HttpClient)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
at sun.net.www.http.HttpClient.New(HttpClient.java:308)
at sun.net.www.http.HttpClient.New(HttpClient.java:326)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1202)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1138)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1032)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:966)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1316)
- locked <0x00000000910e2358> (a sun.net.www.protocol.http.HttpURLConnection)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1291)
- locked <0x00000000910e2358> (a sun.net.www.protocol.http.HttpURLConnection)
at com.caucho.hessian.client.HessianProxy.sendRequest(HessianProxy.java:298)
at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:166)
at com.sun.proxy.$Proxy45.getSimilarity(Unknown Source)
at com.coola.component.search.share.SearchClientNew.getSimilarity(SearchClientNew.java:78)
at com.zhenai.website.modules.recommend.service.impl.RecommendServiceImpl.getGuessLike(RecommendServiceImpl.java:305)
at com.zhenai.website.action.profile.ProfilePageAct.profileIndex(ProfilePageAct.java:524)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.web.method.support.InvocableHandlerMethod.invoke(InvocableHandlerMethod.java:215)
at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:132)
at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:104)
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandleMethod(RequestMappingHandlerAdapter.java:745)
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:685)
at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:80)
at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:919)
at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:851)
at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:953)
at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:844)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:118)
at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:829)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:95)
at com.caucho.server.dispatch.ServletFilterChain.doFilter(ServletFilterChain.java:103)
at com.zhenai.website.common.web.interceptor.OfflineFilter.doFilter(OfflineFilter.java:141)
at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:87)
at com.zhenai.commons.web.filter.SidFilter.doFilter(SidFilter.java:77)
at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:87)
at com.zhenai.commons.web.filter.XssFilter.doFilter(XssFilter.java:58)
at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:87)
at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:88)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:106)
at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:87)
at com.zhenai.commons.web.filter.IpBlackFilter.doFilter(IpBlackFilter.java:69)
at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:87)
at com.caucho.server.webapp.WebAppFilterChain.doFilter(WebAppFilterChain.java:187)
at com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:265)
at com.caucho.server.http.HttpRequest.handleRequest(HttpRequest.java:273)
at com.caucho.server.port.TcpConnection.run(TcpConnection.java:682)
- locked <0x00000000b5d04cd8> (a java.lang.Object)
at com.caucho.util.ThreadPool$Item.runTasks(ThreadPool.java:743)
at com.caucho.util.ThreadPool$Item.run(ThreadPool.java:662)
at java.lang.Thread.run(Thread.java:745)

可以很容易看出来, 是某个Http请求阻塞了.:

1
at com.coola.component.search.share.SearchClientNew.getSimilarity(SearchClientNew.java:78)

原因是调用其他部门提供的Http接口连接不上.

有同事分享了一个相似的场景:
API调用Dubbo超时, 查看Dubbo日志发现线程池占满了. 查看线程信息后发现原因也是一个没有设置超时的HTTP请求阻塞.

3.2 查看CPU占用最高的线程(Linux)

输入top命令后按照CPU占用排序.

  • Linux上是输入大写P.
  • Mac上是输入o, 再输入cpu后按enter键

PID为1707. 继续查看CPU占用最大的线程, top -H -p 1707. 结果如下:

查看线程ID为1751对应的Java线程. 上文提到过, 这个线程对应的是结果中的NID. 但是top显示的是10进制, 而stack是16进制的. 1751的十六进制格式为0x6d7. 可以通过jstack查看对应的Java线程 jstack 1707 | grep 0x6d7即可看到是哪些Java代码造成的了:

1
"ReputMessageService" #33 prio=5 os_prio=0 tid=0x00007ffb60743fa0 nid=0x6d7 waiting on condition [0x00007ffa6ddfa000]

See Also